Effectiveness of Prompt Optimization in NL2SQL Systems

A Multi-Objective Approach to Improving Accuracy and Efficiency

Published

May 26, 2025

Authors: S. Gurajada et al.
Link: http://arxiv.org/abs/2505.20591v1
Institutions: Megagon Labs • Adobe
Keywords: NL2SQL, large language models, prompt optimization, in-context learning, multi-objective optimization, schema pruning, SQL generation, BIRD dataset, GPT-4o, query latency


Natural Language to SQL (NL2SQL) systems leverage large language models (LLMs) to translate natural language questions into SQL statements over domain-specific databases. Traditionally, in-context learning (ICL) methods guide the LLM with carefully selected schema elements, cell values, and exemplars, typically chosen through retrieval-based techniques that add inference-time overhead. Prior work has focused on SQL generation quality while often overlooking efficiency and suitability for production deployment.
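To make the inference-time overhead concrete, here is a rough sketch of how a retrieval-based ICL prompt is typically assembled. All names below, and the cheap lexical-overlap retriever, are illustrative assumptions, not the paper's pipeline:

```python
# Hypothetical sketch of a retrieval-based ICL prompt builder for NL2SQL.
from dataclasses import dataclass

@dataclass
class Exemplar:
    question: str
    sql: str

def jaccard(a: str, b: str) -> float:
    """Cheap lexical similarity standing in for a real retrieval model."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def build_prompt(question: str, schema: str, cell_values: str,
                 pool: list[Exemplar], k: int = 2) -> str:
    # Retrieval runs at inference time for every query -- this per-query
    # selection step is the latency overhead the paper highlights.
    exemplars = sorted(pool, key=lambda e: jaccard(e.question, question),
                       reverse=True)[:k]
    parts = [f"Schema:\n{schema}", f"Sample values:\n{cell_values}"]
    for e in exemplars:
        parts.append(f"Q: {e.question}\nSQL: {e.sql}")
    parts.append(f"Q: {question}\nSQL:")
    return "\n\n".join(parts)
```

Note that prompt length grows with the schema, the sampled cell values, and `k` retrieved exemplars, so every component selected here costs both tokens and retrieval latency.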

To address these efficiency challenges, the authors propose a multi-objective prompt optimization approach for NL2SQL: rather than tuning prompts for SQL generation accuracy alone, the framework also accounts for efficiency objectives such as prompt length and query latency, optimizing in-context components like the pruned schema, cell values, and exemplars jointly rather than selecting them per query at inference time.
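One generic way to frame an accuracy-versus-cost trade-off like this is Pareto filtering over candidate prompt configurations. The sketch below is an illustration under that assumption, not the paper's algorithm:

```python
# Illustrative multi-objective selection: keep prompt configurations that
# are not dominated on (accuracy: higher is better, cost: lower is better).
# "cost" could proxy prompt tokens or latency; the field names are assumptions.

def pareto_front(configs: list[dict]) -> list[dict]:
    """Return configurations for which no other config is at least as
    accurate AND at least as cheap, with strict improvement in one."""
    front = []
    for c in configs:
        dominated = any(
            o["accuracy"] >= c["accuracy"] and o["cost"] <= c["cost"]
            and (o["accuracy"] > c["accuracy"] or o["cost"] < c["cost"])
            for o in configs
        )
        if not dominated:
            front.append(c)
    return front
```

A deployment could then pick from the surviving frontier according to its own latency budget, which is the practical point of treating prompt design as multi-objective rather than accuracy-only.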

Following this approach, the authors evaluate the framework on the BIRD benchmark using GPT-4o, measuring both SQL generation accuracy and query latency.
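For context, accuracy on benchmarks like BIRD is commonly measured as execution accuracy: a predicted query counts as correct when it returns the same results as the gold query. A minimal sketch, comparing unordered result sets (a common simplification that ignores row order and duplicates):

```python
# Simplified execution-accuracy check over a SQLite database.
import sqlite3

def execution_match(conn: sqlite3.Connection, pred_sql: str, gold_sql: str) -> bool:
    """True if the predicted and gold queries return the same set of rows."""
    pred = set(conn.execute(pred_sql).fetchall())
    gold = set(conn.execute(gold_sql).fetchall())
    return pred == gold
```

Averaging this indicator over a benchmark's examples gives the accuracy objective, while wall-clock time per query gives the latency objective.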

Finally, the paper concludes that optimizing prompts for efficiency as well as accuracy yields NL2SQL systems that are markedly better suited to production deployment than accuracy-only ICL pipelines.